Linking Iconclass to the Italian National Cultural Heritage Catalogue: a lightweight LOD bridge

Synopsis

This article describes how a simple intermediate web page can function as a bridge between the Iconclass browser and the Italian National Cultural Heritage Catalogue (catalogo.beniculturali.it), allowing users navigating iconographic notations to discover which objects in the Italian national collection share the same iconographic subject.

Background

Iconclass is a widely used iconographic classification system, originally developed by Henri van de Waal and now maintained as a linked data resource. Each notation, such as 73D6 (the crucifixion of Christ) or 11F4 (the Madonna, i.e. Mary with the Christ-child), identifies a specific iconographic concept and is associated with a stable URI of the form http://iconclass.org/{notation}.

The Italian National Cultural Heritage Catalogue is maintained by the ICCD (Istituto Centrale per il Catalogo e la Documentazione), part of the Italian Ministry of Culture. Its linked data representation, known as ARCo (Architecture of Knowledge for Cultural Heritage), is published as an RDF knowledge graph queryable via a public SPARQL endpoint at https://dati.cultura.gov.it/sparql. ARCo is aligned to CIDOC-CRM and uses its own ontology namespace at https://w3id.org/arco/.

The linking problem

A direct link from an Iconclass notation page to the Italian catalogue is not straightforward for three reasons.

First, the catalogue's public website does not expose Iconclass as a URL parameter in its search interface, so there is no single URL of the form catalogo.beniculturali.it/search?iconclass=73D6 to link to.

Second, while the ARCo knowledge graph does record Iconclass notations, using the dedicated property a-cd:iconclassCode (https://w3id.org/arco/ontology/context-description/iconclassCode), they are attached to each record only through this ARCo-specific property, not through owl:sameAs triples, so finding the records for a given notation requires a SPARQL query. Such a query can be run against the public endpoint, but executing it directly from a browser page is blocked by CORS (Cross-Origin Resource Sharing) restrictions on the endpoint.

Third, displaying a flat list of potentially hundreds of object links directly on an Iconclass notation page would clutter the browser interface.

The solution: a static intermediate site

The approach described here avoids all three problems with a pre-built static website that acts as an intermediate layer between Iconclass and the catalogue.

Data extraction

The starting point is an N-Triples (.nt) file exported from the ARCo dataset, containing triples of the form:

<https://w3id.org/arco/resource/HistoricOrArtisticProperty/1100368968>
<https://w3id.org/arco/ontology/context-description/iconclassCode>
<http://iconclass.org/11F4>

This file, containing 94,441 triples covering 18,326 unique Iconclass notations, is parsed in Python to group ARCo resource URIs by notation. For each ARCo URI, the corresponding catalogue record URL is derived by a simple string substitution:

https://w3id.org/arco/resource/HistoricOrArtisticProperty/{id}
→ https://catalogo.beniculturali.it/detail/HistoricOrArtisticProperty/{id}
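The parsing and substitution steps above can be sketched roughly as follows. This is a minimal illustration, not the script actually used: it assumes each statement sits on a single line in standard N-Triples form (ending in " ."), and the function names group_by_notation and catalogue_url are hypothetical.

```python
import re
from collections import defaultdict

ICONCLASS_PROP = "https://w3id.org/arco/ontology/context-description/iconclassCode"
ICONCLASS_BASE = "http://iconclass.org/"
ARCO_BASE = "https://w3id.org/arco/resource/"
CATALOGUE_BASE = "https://catalogo.beniculturali.it/detail/"

# Assumption: one statement per line, all three terms are IRIs in angle brackets.
TRIPLE_RE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.$")


def group_by_notation(lines):
    """Group ARCo resource URIs by the Iconclass notation they point to."""
    groups = defaultdict(list)
    for line in lines:
        match = TRIPLE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines, comments, and literal-valued triples
        subject, predicate, obj = match.groups()
        if predicate != ICONCLASS_PROP or not obj.startswith(ICONCLASS_BASE):
            continue
        # The notation is whatever follows the Iconclass base URI
        # (kept as-is; percent-decoding may be needed for some exports).
        notation = obj[len(ICONCLASS_BASE):]
        groups[notation].append(subject)
    return dict(groups)


def catalogue_url(arco_uri):
    """Derive the catalogue record URL by swapping the URI prefix."""
    return arco_uri.replace(ARCO_BASE, CATALOGUE_BASE, 1)
```

In use, the export would be streamed line by line, for example group_by_notation(open("export.nt", encoding="utf-8")), where export.nt is a placeholder filename.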

File structure

The result is a static site consisting of three components:

arco-index.json — a single JSON file (~1.5 MB) mapping every Iconclass notation to a record count and a filename, used to power the search interface without loading all data upfront.
data/{hash}.json — one small JSON file per notation (18,326 files total), each containing the list of ARCo and catalogo URLs for that notation. Files are named using the MD5 hash of the notation string to avoid filesystem issues with special characters (parentheses, accented characters, and long notation strings are common in Iconclass).
index.html — a single self-contained HTML page that loads the index on startup, filters notations in real time as the user types, and fetches detail files on demand when a notation is expanded.
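As a rough illustration of how the index and the per-notation files might be generated, the sketch below takes a mapping of notations to ARCo URIs and writes both outputs. The JSON key names ("count", "file", "arco", "catalogo") and the function names are assumptions made for this example; the article does not specify the exact layout.

```python
import hashlib
import json
from pathlib import Path

ARCO_BASE = "https://w3id.org/arco/resource/"
CATALOGUE_BASE = "https://catalogo.beniculturali.it/detail/"


def detail_filename(notation):
    """Name the detail file after the MD5 hex digest of the notation,
    so parentheses, accents, and long strings never reach the filesystem."""
    return hashlib.md5(notation.encode("utf-8")).hexdigest() + ".json"


def write_site(groups, out_dir):
    """Write data/{hash}.json for each notation plus arco-index.json."""
    out = Path(out_dir)
    (out / "data").mkdir(parents=True, exist_ok=True)
    index = {}
    for notation, arco_uris in sorted(groups.items()):
        name = detail_filename(notation)
        records = [
            {"arco": uri, "catalogo": uri.replace(ARCO_BASE, CATALOGUE_BASE, 1)}
            for uri in arco_uris
        ]
        (out / "data" / name).write_text(
            json.dumps(records, ensure_ascii=False), encoding="utf-8"
        )
        # The index holds only what the search box needs: a count and a filename.
        index[notation] = {"count": len(arco_uris), "file": name}
    (out / "arco-index.json").write_text(
        json.dumps(index, ensure_ascii=False), encoding="utf-8"
    )
    return index
```

Keeping URLs out of the index is what lets the page load one moderately sized file at startup and defer everything else until a notation is opened.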

Linking

From any Iconclass notation page, a link of the form:

https://{host}/arco/index.html#73D6

opens the intermediate page with 73D6 pre-filled in the search box, the matching result automatically expanded, and the page scrolled to it. The hash fragment is used rather than a query parameter so that the link works with any static file host without server-side routing configuration.
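For notations that contain parentheses or other special characters, it may be safer to percent-encode the fragment when generating these links. The sketch below shows one way to build and read back such a link; it is written in Python for consistency with the extraction script, whereas the page itself would do the decoding in the browser. The function names and the example host are hypothetical, and the assumption is that the page decodes the fragment before searching.

```python
from urllib.parse import quote, unquote, urlsplit


def notation_link(page_url, notation):
    """Build a link that carries the notation in the hash fragment."""
    # safe="" percent-encodes everything except letters, digits and "_.-~".
    return f"{page_url}#{quote(notation, safe='')}"


def notation_from_link(link):
    """Recover the notation from a link built by notation_link."""
    return unquote(urlsplit(link).fragment)


link = notation_link("https://example.org/arco/index.html", "73D6")
# Plain alphanumeric notations pass through unchanged.
assert link == "https://example.org/arco/index.html#73D6"
```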

Technical characteristics

Property	Value
Source triples	94,441
Unique Iconclass notations	18,326
Total archive size (compressed)	~2 MB
Server requirements	None (static files only)
SPARQL dependency at runtime	None
Data freshness	Determined by re-export frequency

Because the site is entirely static, it can be hosted on GitHub Pages, any CDN, or a simple file server, with no database or server-side scripting required. Each notation's detail file is fetched only when a user opens that notation, keeping initial load fast regardless of the total dataset size.

For reference, the intermediate page also provides a link to the live SPARQL query for each notation, allowing technically minded users to query the endpoint directly and retrieve additional metadata such as object labels.
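A per-notation query link of this kind could be generated along the following lines. This is only a sketch: it assumes the Iconclass URI is the object of the iconclassCode triple (as in the example triple shown earlier), that records carry an rdfs:label, and that the endpoint accepts the standard SPARQL protocol "query" parameter on a GET request. The function names and the default limit are illustrative.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dati.cultura.gov.it/sparql"
ICONCLASS_PROP = "https://w3id.org/arco/ontology/context-description/iconclassCode"
ICONCLASS_BASE = "http://iconclass.org/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Characters that may not appear inside an IRI written between angle brackets.
FORBIDDEN = set('<>"{}|\\^`')


def build_query(notation, limit=100):
    """Return a SPARQL query listing records (and labels) for one notation."""
    if any(ch in FORBIDDEN or ch.isspace() for ch in notation):
        raise ValueError(f"notation cannot be embedded in an IRI: {notation!r}")
    return (
        "SELECT ?record ?label WHERE {\n"
        f"  ?record <{ICONCLASS_PROP}> <{ICONCLASS_BASE}{notation}> .\n"
        f"  OPTIONAL {{ ?record <{RDFS_LABEL}> ?label }}\n"
        "}\n"
        f"LIMIT {limit}"
    )


def query_url(notation, limit=100):
    """Return a GET URL that runs the query against the public endpoint."""
    return ENDPOINT + "?" + urlencode({"query": build_query(notation, limit)})
```

Because the link is followed as an ordinary navigation rather than fetched by script, the CORS restriction mentioned earlier does not come into play.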

Limitations and future work

The current implementation links to catalogue records by their numeric identifier, without displaying the object's title or thumbnail. Enriching the N-Triples export with rdfs:label values and image URLs would make the intermediate page significantly more useful to end users.

The notation-to-URI mapping in ARCo is currently one-directional: the Iconclass URI appears as the object of the iconclassCode triple, but there are no owl:sameAs statements linking ARCo resources to http://iconclass.org/ URIs in the published graph. Establishing such links formally would enable richer federation across LOD datasets.

Finally, the static approach requires periodic regeneration of the data files to stay current with updates to the ARCo dataset. This can be automated with a scheduled script that re-exports the N-Triples file and rebuilds the index and the per-notation files.